Acyclic Subgraph based Descriptor Spaces for Chemical Compound Retrieval and Classification

Authors

  • Nikil Wale
  • George Karypis
Abstract

In recent years, the development of computational techniques that build models to correctly assign chemical compounds to various classes or to retrieve potential drug-like compounds has been an active area of research. These techniques are used extensively at various phases of the drug development process. Many of the best-performing techniques for these tasks use a descriptor-based representation of the compound that captures various aspects of the underlying molecular graph's topology. In this paper we introduce and describe algorithms for efficiently generating a new set of descriptors that are derived from all connected acyclic fragments present in the molecular graphs. In addition, we introduce an extension to existing vector-based kernel functions that takes into account the length of the fragments present in the descriptors. We experimentally evaluate the performance of the new descriptors in the context of SVM-based classification and ranked retrieval on 28 classification and retrieval problems derived from 17 datasets. Our experiments show that for both the classification and retrieval tasks, these new descriptors consistently and statistically outperform previously developed schemes based on the widely used fingerprint- and MACCS keys-based descriptors, as well as recently introduced descriptors obtained by mining and analyzing the structure of the molecular graphs.
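
To make the phrase "all connected acyclic fragments" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes an unlabeled, simple molecular graph given as a list of atom-index pairs, and it grows each fragment one bond at a time, accepting a bond only when exactly one of its atoms is already in the fragment, so the result stays connected and cycle-free. The function name `acyclic_fragments` and the `max_edges` cap are hypothetical choices made for illustration; real descriptors would also carry atom and bond labels, which are omitted here.

```python
from collections import defaultdict


def acyclic_fragments(edges, max_edges):
    """Enumerate connected acyclic edge subsets (trees) of a simple graph.

    edges: iterable of (u, v) atom-index pairs (assumed: no self-loops).
    max_edges: hypothetical cap on fragment length, counted in bonds.
    Returns a dict mapping fragment length -> set of fragments, where each
    fragment is a frozenset of edges and each edge is a frozenset of 2 atoms.
    """
    edge_set = {frozenset(e) for e in edges}
    if not edge_set or max_edges < 1:
        return {}

    # Length-1 fragments are the individual bonds.
    current = {frozenset([e]) for e in edge_set}
    by_length = defaultdict(set)
    by_length[1] = set(current)

    for length in range(2, max_edges + 1):
        grown = set()
        for frag in current:
            atoms = frozenset().union(*frag)  # atoms already in the fragment
            for e in edge_set - frag:
                # Exactly one shared atom: the bond attaches a new leaf,
                # so the fragment stays connected and acyclic.
                if len(e & atoms) == 1:
                    grown.add(frag | {e})
        if not grown:
            break
        by_length[length] = grown
        current = grown

    return dict(by_length)


# Usage: a three-membered ring has 3 single bonds and 3 two-bond paths,
# but no three-bond acyclic fragment (that would be the ring itself).
triangle = [(0, 1), (1, 2), (0, 2)]
counts = {k: len(v) for k, v in acyclic_fragments(triangle, 3).items()}
print(counts)  # {1: 3, 2: 3}
```

Grouping the fragments by length, as the returned dictionary does, is one simple way to expose fragment length to a downstream kernel; how the paper's kernel extension actually uses length is not specified in this abstract.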

Similar resources

Classification and properties of acyclic discrete phase-type distributions based on geometric and shifted geometric distributions

Acyclic phase-type distributions form a versatile model, serving as approximations to many probability distributions in various circumstances. They exhibit special properties and characteristics that usually make their applications attractive. Compared to acyclic continuous phase-type (ACPH) distributions, acyclic discrete phase-type (ADPH) distributions and their subclasses (ADPH family) have ...

Shape-based object retrieval using descriptors obtained from a contour growing process

In this paper, a novel shape descriptor for shape-based object retrieval is proposed. A growing process is introduced in which a contour is reconstructed from the bounding circle of the shape. In this growing process, circle points move toward the shape in the normal direction until they reach the shape contour. Three different shape descriptors are extracted from this process: the first descript...

MULTI CLASS BRAIN TUMOR CLASSIFICATION OF MRI IMAGES USING HYBRID STRUCTURE DESCRIPTOR AND FUZZY LOGIC BASED RBF KERNEL SVM

Medical Image segmentation is to partition the image into a set of regions that are visually obvious and consistent with respect to some properties such as gray level, texture or color. Brain tumor classification is an imperative and difficult task in cancer radiotherapy. The objective of this research is to examine the use of pattern classification methods for distinguishing different types of...

Texture descriptor based on local combination adaptive ternary pattern

Material recognition has several applications, such as image retrieval, object recognition and robotic manipulation. To make the material classification more suitable for real-world applications, it is fundamental to satisfy two characteristics: robustness to scale and to pose variations. In this study, the authors propose a novel discriminant descriptor for texture classification based on a ne...

A new approach to compute acyclic chromatic index of certain chemical structures

An acyclic edge coloring of a graph is a proper edge coloring such that there are no bichromatic cycles. The acyclic chromatic index of a graph $G$, denoted by $\chi_a'(G)$, is the minimum number $k$ such that there is an acyclic edge coloring using $k$ colors. The maximum degree in $G$, denoted by $\Delta(G)$, is the lower bound for $\chi_a'(G)$. $P$-cuts introduced in this paper acts as a powerfu...


Publication date: 2006